<!DOCTYPE html>
<html lang="en">
<head>
	<meta charset="UTF-8">
	<meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1">
	<title>Token graphs | ElasticSearch 7.7 权威指南中文版</title>
	<meta name="keywords" content="ElasticSearch 权威指南中文版, elasticsearch 7, es7, 实时数据分析，实时数据检索" />
    <meta name="description" content="ElasticSearch 权威指南中文版, elasticsearch 7, es7, 实时数据分析，实时数据检索" />
    <!-- Give IE8 a fighting chance -->
    <!--[if lt IE 9]>
    <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
    <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
    <![endif]-->
	<link rel="stylesheet" type="text/css" href="../static/styles.css" />
	<script>
	var _link = 'token-graphs.html';
    </script>
</head>
<body>
<div class="main-container">
    <section id="content">
        <div class="content-wrapper">
            <section id="guide" lang="zh_cn">
                <div class="container">
                    <div class="row">
                        <div class="col-xs-12 col-sm-8 col-md-8 guide-section">
                            <div style="color:gray; word-break: break-all; font-size:12px;">原英文版地址: <a href="https://www.elastic.co/guide/en/elasticsearch/reference/7.7/token-graphs.html" rel="nofollow" target="_blank">https://www.elastic.co/guide/en/elasticsearch/reference/7.7/token-graphs.html</a>, 原文档版权归 www.elastic.co 所有<br/>本地英文版地址: <a href="../en/token-graphs.html" rel="nofollow" target="_blank">../en/token-graphs.html</a></div>
                        <!-- start body -->
                  <div class="page_header">
<strong>重要</strong>: 此版本不会发布额外的bug修复或文档更新。最新信息请参考 <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html" rel="nofollow">当前版本文档</a>。
</div>
<div id="content">
<div class="breadcrumbs">
<span class="breadcrumb-link"><a href="index.html">Elasticsearch Guide [7.7]</a></span>
»
<span class="breadcrumb-link"><a href="analysis.html">Text analysis</a></span>
»
<span class="breadcrumb-link"><a href="analysis-concepts.html">Text analysis concepts</a></span>
»
<span class="breadcrumb-node">Token graphs</span>
</div>
<div class="navheader">
<span class="prev">
<a href="stemming.html">« Stemming</a>
</span>
<span class="next">
<a href="configure-text-analysis.html">Configure text analysis »</a>
</span>
</div>
<div class="section">
<div class="titlepage"><div><div>
<h2 class="title">
<a id="token-graphs"></a>Token graphs<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/token-graphs.asciidoc">edit</a>
</h2>
</div></div></div>
<p>When a <a class="xref" href="analyzer-anatomy.html#analyzer-anatomy-tokenizer" title="Tokenizer">tokenizer</a> converts a text into a stream of
tokens, it also records the following:</p>
<div class="ulist itemizedlist">
<ul class="itemizedlist">
<li class="listitem">
The <code class="literal">position</code> of each token in the stream
</li>
<li class="listitem">
The <code class="literal">positionLength</code>, the number of positions that a token spans
</li>
</ul>
</div>
<p>Using these, you can create a
<a href="https://en.wikipedia.org/wiki/Directed_acyclic_graph" class="ulink" target="_top">directed acyclic graph</a>,
called a <em>token graph</em>, for a stream. In a token graph, each position represents
a node. Each token represents an edge or arc, pointing to the next position.</p>
<div class="imageblock text-center">
<div class="content">
<img src="../static/images/analysis/token-graph-qbf-ex.svg" alt="token graph qbf ex">
</div>
</div>
<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="token-graphs-synonyms"></a>Synonyms<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/token-graphs.asciidoc">edit</a>
</h3>
</div></div></div>
<p>Some <a class="xref" href="analyzer-anatomy.html#analyzer-anatomy-token-filters" title="Token filters">token filters</a> can add new tokens, like
synonyms, to an existing token stream. These synonyms often span the same
positions as existing tokens.</p>
<p>In the following graph, <code class="literal">quick</code> and its synonym <code class="literal">fast</code> both have a position of
<code class="literal">0</code>. They span the same positions.</p>
<div class="imageblock text-center">
<div class="content">
<img src="../static/images/analysis/token-graph-qbf-synonym-ex.svg" alt="token graph qbf synonym ex">
</div>
</div>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="token-graphs-multi-position-tokens"></a>Multi-position tokens<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/token-graphs.asciidoc">edit</a>
</h3>
</div></div></div>
<p>Some token filters can add tokens that span multiple positions. These can
include tokens for multi-word synonyms, such as using "atm" as a synonym for
"automatic teller machine."</p>
<p>However, only some token filters, known as <em>graph token filters</em>, accurately
record the <code class="literal">positionLength</code> for multi-position tokens. This filters include:</p>
<div class="ulist itemizedlist">
<ul class="itemizedlist">
<li class="listitem">
<a class="xref" href="analysis-synonym-graph-tokenfilter.html" title="Synonym graph token filter"><code class="literal">synonym_graph</code></a>
</li>
<li class="listitem">
<a class="xref" href="analysis-word-delimiter-graph-tokenfilter.html" title="Word delimiter graph token filter"><code class="literal">word_delimiter_graph</code></a>
</li>
</ul>
</div>
<p>In the following graph, <code class="literal">domain name system</code> and its synonym, <code class="literal">dns</code>, both have a
position of <code class="literal">0</code>. However, <code class="literal">dns</code> has a <code class="literal">positionLength</code> of <code class="literal">3</code>. Other tokens in
the graph have a default <code class="literal">positionLength</code> of <code class="literal">1</code>.</p>
<div class="imageblock text-center">
<div class="content">
<img src="../static/images/analysis/token-graph-dns-synonym-ex.svg" alt="token graph dns synonym ex">
</div>
</div>
<div class="section">
<div class="titlepage"><div><div>
<h4 class="title">
<a id="token-graphs-token-graphs-search"></a>Using token graphs for search<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/token-graphs.asciidoc">edit</a>
</h4>
</div></div></div>
<p><a class="xref" href="analysis-index-search-time.html" title="Index and search analysis">Indexing</a> ignores the <code class="literal">positionLength</code> attribute
and does not support token graphs containing multi-position tokens.</p>
<p>However, queries, such as the <a class="xref" href="query-dsl-match-query.html" title="Match query"><code class="literal">match</code></a> or
<a class="xref" href="query-dsl-match-query-phrase.html" title="Match phrase query"><code class="literal">match_phrase</code></a> query, can use these graphs to
generate multiple sub-queries from a single query string.</p>
<details>
<summary class="title"><span class="strong strong"><strong>Example</strong></span></summary>
<div class="content">
<p>A user runs a search for the following phrase using the <code class="literal">match_phrase</code> query:</p>
<p><code class="literal">domain name system is fragile</code></p>
<p>During <a class="xref" href="analysis-index-search-time.html" title="Index and search analysis">search analysis</a>, <code class="literal">dns</code>, a synonym for
<code class="literal">domain name system</code>, is added to the query string’s token stream. The <code class="literal">dns</code>
token has a <code class="literal">positionLength</code> of <code class="literal">3</code>.</p>
<div class="imageblock text-center">
<div class="content">
<img src="../static/images/analysis/token-graph-dns-synonym-ex.svg" alt="token graph dns synonym ex">
</div>
</div>
<p>The <code class="literal">match_phrase</code> query uses this graph to generate sub-queries for the
following phrases:</p>
<div class="pre_wrapper lang-text">
<pre class="programlisting prettyprint lang-text">dns is fragile
domain name system is fragile</pre>
</div>
<p>This means the query matches documents containing either <code class="literal">dns is fragile</code> <em>or</em>
<code class="literal">domain name system is fragile</code>.</p>
</div>
</details>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h4 class="title">
<a id="token-graphs-invalid-token-graphs"></a>Invalid token graphs<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/token-graphs.asciidoc">edit</a>
</h4>
</div></div></div>
<p>The following token filters can add tokens that span multiple positions but
only record a default <code class="literal">positionLength</code> of <code class="literal">1</code>:</p>
<div class="ulist itemizedlist">
<ul class="itemizedlist">
<li class="listitem">
<a class="xref" href="analysis-synonym-tokenfilter.html" title="Synonym token filter"><code class="literal">synonym</code></a>
</li>
<li class="listitem">
<a class="xref" href="analysis-word-delimiter-tokenfilter.html" title="Word delimiter token filter"><code class="literal">word_delimiter</code></a>
</li>
</ul>
</div>
<p>This means these filters will produce invalid token graphs for streams
containing such tokens.</p>
<p>In the following graph, <code class="literal">dns</code> is a multi-position synonym for <code class="literal">domain name
system</code>. However, <code class="literal">dns</code> has the default <code class="literal">positionLength</code> value of <code class="literal">1</code>, resulting
in an invalid graph.</p>
<div class="imageblock text-center">
<div class="content">
<img src="../static/images/analysis/token-graph-dns-invalid-ex.svg" alt="token graph dns invalid ex">
</div>
</div>
<p>Avoid using invalid token graphs for search. Invalid graphs can cause unexpected
search results.</p>
</div>

</div>

</div>
<div class="navfooter">
<span class="prev">
<a href="stemming.html">« Stemming</a>
</span>
<span class="next">
<a href="configure-text-analysis.html">Configure text analysis »</a>
</span>
</div>
</div>

                  <!-- end body -->
                        </div>
                        <div class="col-xs-12 col-sm-4 col-md-4" id="right_col">
                        
                        </div>
                    </div>
                </div>
            </section>
        </div>
    </section>
</div>
<script src="../static/cn.js"></script>
</body>
</html>